HFSP: The Hadoop Fair Sojourn Protocol

Authors

  • Mario Pastorelli
  • Antonio Barbuzzi
  • Damiano Carra
  • Pietro Michiardi
Abstract

This work presents the HFSP scheduler, which implements a size-based scheduling discipline for Hadoop. While the benefits of size-based scheduling disciplines are well recognized in a variety of contexts (computer networks, operating systems, etc.), their practical implementation for a system such as Hadoop raises a number of important challenges. In HFSP we address issues related to job size estimation and resource management, and we study the effects of a variety of preemption strategies. Although the architecture underlying HFSP is suitable for any size-based scheduling discipline, in this work we revisit and extend the Fair Sojourn Protocol, which solves many problems related to job starvation that affect FIFO, Processor Sharing and a range of size-based disciplines. Our experiments, in which we compare HFSP to standard Hadoop schedulers, point to a significant decrease in average job sojourn times (a metric that accounts for the total time a job spends in the system, including waiting and serving times) for realistic workloads that we generate according to production workload traces available in the literature.
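To make the sojourn-time metric concrete, the following minimal sketch compares the mean sojourn time of FIFO against a simple size-based policy on a single server. It is not HFSP and not the Fair Sojourn Protocol: it assumes job sizes are known exactly, uses no preemption, and models one server rather than a Hadoop cluster. The names `mean_sojourn`, `pick_fifo` and `pick_shortest` are hypothetical helpers introduced only for this illustration.

```python
def mean_sojourn(jobs, pick):
    """Mean sojourn time (completion time minus arrival time) on one server.

    jobs: list of (arrival_time, size) tuples; sizes are assumed known.
    pick: function choosing the next job to run from the waiting list.
    Jobs run to completion once started (no preemption in this sketch).
    """
    # Stable sort by arrival only, so ties keep their submission order.
    pending = sorted(jobs, key=lambda job: job[0])
    waiting = []
    clock = 0.0
    total = 0.0
    admitted = 0
    finished = 0
    while finished < len(pending):
        # Admit every job that has arrived by the current time.
        while admitted < len(pending) and pending[admitted][0] <= clock:
            waiting.append(pending[admitted])
            admitted += 1
        if not waiting:
            # Server is idle: jump to the next arrival.
            clock = pending[admitted][0]
            continue
        job = pick(waiting)
        waiting.remove(job)
        clock += job[1]           # serve the job to completion
        total += clock - job[0]   # sojourn = waiting time + service time
        finished += 1
    return total / len(pending)


def pick_fifo(waiting):
    # Earliest arrival first; ties resolved by submission order.
    return min(waiting, key=lambda job: job[0])


def pick_shortest(waiting):
    # Size-based choice: the smallest waiting job first.
    return min(waiting, key=lambda job: job[1])


# One large job submitted just ahead of two small ones, all at time 0.
jobs = [(0, 10), (0, 1), (0, 2)]
fifo_mean = mean_sojourn(jobs, pick_fifo)
size_based_mean = mean_sojourn(jobs, pick_shortest)
```

For this toy workload the FIFO mean is 34/3 (about 11.33) and the size-based mean is 17/3 (about 5.67), because the small jobs no longer wait behind the large one. A plain shortest-first policy like this can starve large jobs under a steady stream of small ones, which is the kind of problem the abstract says the Fair Sojourn Protocol is meant to address.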

Related resources

Practical Size-based Scheduling for MapReduce Workloads

We present the Hadoop Fair Sojourn Protocol (HFSP) scheduler, which implements a size-based scheduling discipline for Hadoop. The benefits of size-based scheduling disciplines are well recognized in a variety of contexts (computer networks, operating systems, etc.), yet their practical implementation for a system such as Hadoop raises a number of important challenges. With HFSP, which is ava...


Size-based disciplines for job scheduling in data-intensive scalable computing systems. (Disciplines basées sur la taille pour la planification des jobs dans data-intensif scalable computing systems)

The past decade has seen the rise of data-intensive scalable computing (DISC) systems, such as Hadoop, and the consequent demand for scheduling policies to manage their resources, so that they can provide quick response times as well as fairness. Schedulers for DISC systems are usually focused on fairness, without optimizing response times. The best practices to overcome this problem i...


Fairness and Efficiency in Processor Sharing Protocols to Minimize Sojourn Times

We consider the problem of designing a preemptive protocol that is both fair and efficient when one is only concerned with the sojourn time of the job and not intermediate results. Our Fair Sojourn Protocol (FSP) is both efficient, in a strong sense (similar to the shortest remaining processing time protocol – SRPT), and fair, in the sense of guaranteeing that it outperforms processor sharing (...


Efficient fair algorithms for message communication

A computer network serves distributed applications by communicating messages between their remote ends. Many such applications desire minimal delay for their messages. Besides this efficiency objective, allocation of the network capacity is also subject to the fairness constraint of not shutting off communication for any individual message. Processor Sharing (PS) is a de facto standard of fairne...


Preemptive ReduceTask Scheduling for Fair and Fast Job Completion

Hadoop MapReduce adopts a two-phase (map and reduce) scheme to schedule tasks among data-intensive applications. However, under this scheme, Hadoop schedulers do not work effectively for both phases. We reveal that there exists a serious fairness issue among jobs of different sizes, leading to prolonged execution for small jobs, which are starving for reduce slots held by large jobs. To solve t...


Journal title:
  • CoRR

Volume abs/1302.2749

Publication date 2012